Skip to content

⚡ refactor encode engine to use iterative frame-based traversal with identity-based cycle detection#43

Merged
techouse merged 3 commits intomainfrom
chore/optimizations-2
Feb 25, 2026
Merged

⚡ refactor encode engine to use iterative frame-based traversal with identity-based cycle detection#43
techouse merged 3 commits intomainfrom
chore/optimizations-2

Conversation

@techouse
Copy link
Owner

@techouse techouse commented Feb 24, 2026

Summary

This PR improves encode performance in qs.py while preserving public API and behavior, keeping Python 3.8+ compatibility, and adding no runtime dependencies.

What Changed

  • Optimized hot-path runtime checks in encode traversal and utility helpers.
  • Refactored cycle tracking in _encode from WeakWrapper/WeakKeyDictionary traversal to identity-based (id(obj)) bookkeeping.
  • Preserved legacy side-channel compatibility behavior.
  • Added internal key-path caching via KeyPathNode (lazy materialization + cached dot-encoded view).
  • Reduced per-frame allocation overhead in iterative encode traversal.
  • Added focused regression/internal tests for newly introduced helper branches.
  • Restored total coverage to 100%.

Public API / Behavior

  • No public API changes.
  • No intended behavior changes for encode, decode, options, sentinels, or aliases.
  • WeakWrapper remains intact for compatibility and existing tests.

Benchmark Results (local)

CPython 3.14.3 medians (scripts/bench_encode_depth.py):

Depth Before After Delta
2000 0.015277 0.007252 -52.5%
5000 0.040634 0.018116 -55.4%
12000 0.105413 0.044688 -57.6%

Validation

  • pytest -q tests passed.
  • pytest -q --cov=src/qs_codec --cov-report=term-missing passed with TOTAL 100%.
  • tests/comparison/compare_outputs.sh reports identical output vs Node reference.
  • tox -e linters passed.
  • tox -e python3.14 passed.

Summary by CodeRabbit

  • Refactor

    • Encoding internals rewritten to be path-aware and iterative, improving nested-key and sequence path handling, comma-list semantics, and empty-list emission while preserving the public encode/dumps API.
    • Added cached path materialization to speed up path-related operations.
  • Bug Fixes

    • Enforce strict-null RFC formatting for bare keys.
    • Prevent mutation of caller-owned root lists/tuples when using callable filters.
  • Tests

    • Added unit tests covering path handling, list-format generators, internal helpers, and encoding edge cases.

@techouse techouse self-assigned this Feb 24, 2026
@techouse techouse added the enhancement New feature or request label Feb 24, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Walkthrough

Refactors encoder internals to an iterative, path-aware frame engine using a new KeyPathNode for hierarchical key paths; replaces WeakWrapper-based cycle tracking with identity-key cycle state; makes sequence child-path generation generator-aware; switches mapping checks to collections.abc.Mapping; adds tests and changelog. Public API (encode/dumps) unchanged.

Changes

Cohort / File(s) Summary
Path Abstraction
src/qs_codec/models/key_path_node.py
Adds KeyPathNode: immutable linked path nodes with lazy materialize() and as_dot_encoded() caching, iterative dot-encoding, and append/from_materialized helpers.
Encoder Core
src/qs_codec/encode.py
Large internal rewrite: introduces _EncodeFrame iterative traversal with phases (START/ITERATE/AWAIT_CHILD), per-frame path/state (is_mapping/is_sequence/undefined), _MISSING sentinel, identity-based cycle keys, per-key path derivation (including _next_path_for_sequence), and list-format-aware child suffixing. Public encode/dumps surface preserved.
Utilities
src/qs_codec/utils/utils.py
Replaces Mapping checks with collections.abc.Mapping (ABCMapping) and adjusts merge/primitive-detection branches to treat abstract mappings consistently.
Tests — Encode Helpers
tests/unit/encode_internal_helpers_test.py
Adds tests for _identity_key and _next_path_for_sequence behavior with custom list-format suffix generators.
Tests — KeyPathNode
tests/unit/key_path_node_test.py
Adds tests for append/materialize/as_dot_encoded caching, identity reuse, and deep-path robustness.
Tests — ListFormat
tests/unit/list_format_test.py
Adds parametrized tests for ListFormatGenerator variants: brackets, comma, indices, repeat.
Tests — Encoder Behavior
tests/unit/encode_test.py
Adds RFC1738 space-encoding case and tests ensuring callable filters don't mutate caller-owned root lists/tuples.
Changelog
CHANGELOG.md
Adds 1.4.3-wip entries documenting the refactor, KeyPathNode, fixes (deep-dot encoding, strict-null RFC handling), and test expansion.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant API as encode()/dumps
  participant Engine as _EncodeFrameEngine
  participant Path as KeyPathNode
  participant Cycle as CycleState
  participant ListGen as ListFormatGenerator
  participant Output as TokenCollector

  Caller->>API: call encode(obj, options)
  API->>Engine: init root frame
  Engine->>Path: create root KeyPathNode
  Engine->>Engine: process frame (START → ITERATE → AWAIT_CHILD)
  Engine->>ListGen: request suffix for sequence child (if sequence)
  ListGen-->>Engine: return suffix
  Engine->>Path: append segment -> child KeyPathNode
  Engine->>Cycle: compute identity_key and check/push
  Cycle-->>Engine: cycle state result
  Engine->>Output: emit token(s) using KeyPathNode.materialize()/as_dot_encoded()
  Output-->>API: aggregated tokens
  API-->>Caller: return encoded string
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐇
I hop through frames and stitch each key,
Dots tucked safe, paths grown carefully,
Cycles sniffed and gently bound,
Tokens tumble, ordered, sound—
A tiny hop, encoded free ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning The PR description lacks required sections from the template: Type of change checkboxes, Testing description, and Checklist items are not provided. Complete the PR description by adding the 'Type of change' section with appropriate selections, 'How Has This Been Tested?' with test details, and a filled Checklist confirming code quality standards and test validation.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and concisely summarizes the main change: refactoring the encode engine to use iterative frame-based traversal with identity-based cycle detection. It is specific, directly related to the primary objective, and avoids vague or misleading language.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/optimizations-2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@techouse techouse changed the title Chore/optimizations 2 ⚡ improve encode performance Feb 24, 2026
@codacy-production
Copy link

codacy-production bot commented Feb 24, 2026

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.00% (target: -1.00%) 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (3c7db66) 1436 1436 100.00%
Head commit (1c73320) 1576 (+140) 1576 (+140) 100.00% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#43) 220 220 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (7)
src/qs_codec/encode.py (3)

171-171: _MISSING is defined after the encode function that references it (Line 106).

missing = _MISSING  # line 106 – inside encode()
...
_MISSING = object()  # line 171 – module level

This is safe at runtime (Python resolves names at call time, not definition time), but the forward reference is non-idiomatic: readers scanning top-to-bottom encounter _MISSING used before it is defined, and static tools may flag it. Moving _MISSING above encode (alongside _sentinel) eliminates the confusion.

♻️ Proposed fix
+_MISSING = object()
+
 # Unique placeholder used as a key within the side-channel chain to pass context down recursion.
 _sentinel: WeakWrapper = WeakWrapper({})
 MAX_ENCODING_DEPTH_EXCEEDED = "Maximum encoding depth exceeded"
-
-_MISSING = object()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` at line 171, Move the module-level sentinel _MISSING
so it is defined before the encode function that references it: locate the
encode(...) function (which assigns missing = _MISSING) and relocate the line
defining _MISSING = object() to the top of the module near the existing
_sentinel (or otherwise above encode). This ensures _MISSING is defined before
use and avoids forward-reference confusion for readers and static tools.

536-541: Redundant None-guard after unconditional assignment (lines 539-541).

frame.path is unconditionally assigned at Line 539; the if frame.path is None guard on Line 540 can never be True. The # pragma: no cover annotation acknowledges this, but the dead branch adds noise. If the invariant is important to document, an assert or a single comment is cleaner.

♻️ Suggested cleanup
 frame.path = KeyPathNode.from_materialized(frame.prefix)
-if frame.path is None:  # pragma: no cover - internal invariant
-    raise RuntimeError("path is not initialized")  # noqa: TRY003
 current_path = frame.path
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 536 - 541, The second None-guard after
unconditionally assigning frame.path is dead code; remove the redundant if block
checking frame.path and the RuntimeError, and instead document the invariant
with an assertion or comment (e.g., assert frame.path is not None) right after
the assignment where frame.path is set (reference symbols: frame.path,
frame.prefix, KeyPathNode.from_materialized, frame.add_query_prefix). Ensure
tests/coverage reflect the simpler flow.

187-225: Sort _EncodeFrame.__slots__ to satisfy Ruff RUF023.

Ruff reports _EncodeFrame.__slots__ is not sorted (RUF023). Applying a natural (alphabetical) sort resolves the warning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 187 - 225, The __slots__ tuple on class
_EncodeFrame is not alphabetically ordered which triggers Ruff RUF023; open the
_EncodeFrame class and reorder the entries in the __slots__ tuple (reference:
_EncodeFrame.__slots__) into natural/alphabetical order (maintain all existing
names and commas, preserving the tuple structure) so the list is sorted and the
linter warning is resolved.
tests/unit/encode_internal_helpers_test.py (1)

36-37: Add # noqa: ARG001 to suppress the Ruff unused-argument warning.

# pylint: disable=W0613 silences pylint but Ruff's ARG001 is still reported on the prefix parameter.

♻️ Proposed fix
-    def custom_generator(prefix: str, key: t.Optional[str]) -> str:  # pylint: disable=W0613
+    def custom_generator(prefix: str, key: t.Optional[str]) -> str:  # pylint: disable=W0613  # noqa: ARG001
         return f"other[{key}]"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/encode_internal_helpers_test.py` around lines 36 - 37, The test
helper function custom_generator currently disables pylint for the unused
parameter but still triggers Ruff's ARG001 for the unused prefix parameter;
update the function signature for custom_generator(prefix: str, key:
t.Optional[str]) to add "# noqa: ARG001" alongside the existing pylint disable
so Ruff won't flag the unused argument, keeping the existing behavior that only
key is used.
src/qs_codec/models/key_path_node.py (2)

9-9: Docstring says "Immutable" but dot_encoded and materialized slots are mutated as caches.

The class is immutable in the domain-sense (path semantics don't change after construction), but the slots are write-once lazy caches. Adjusting the docstring avoids confusion.

♻️ Suggested wording
-    """Immutable key-path node with lazy materialization and dot-encoded caching."""
+    """Key-path node with stable path semantics and lazily-populated materialization and dot-encoded caches."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/models/key_path_node.py` at line 9, Update the module/class
docstring to remove the absolute claim of "Immutable" and state that KeyPathNode
represents immutable path semantics but uses write-once lazy caches: mention
that dot_encoded and materialized are lazily populated cached slots (written
once) rather than mutated for semantics; keep the intent that path semantics do
not change after construction but clarify the caching behavior for dot_encoded
and materialized in the docstring for the KeyPathNode class.

11-11: Sort __slots__ to satisfy Ruff RUF023.

♻️ Proposed fix
-    __slots__ = ("parent", "segment", "depth", "total_length", "dot_encoded", "materialized")
+    __slots__ = ("depth", "dot_encoded", "materialized", "parent", "segment", "total_length")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/models/key_path_node.py` at line 11, The __slots__ tuple in
class KeyPathNode is not sorted, triggering Ruff RUF023; update the __slots__
declaration (the "__slots__" symbol in key_path_node.py within class
KeyPathNode) to be a deterministically sorted tuple of attribute names (e.g.,
alphabetically) so that ("dot_encoded", "depth", "materialized", "parent",
"segment", "total_length") replaces the current unsorted list; ensure the new
order is used wherever __slots__ is referenced.
src/qs_codec/utils/utils.py (1)

618-620: return isinstance(val, object) is always True; consider spelling it explicitly.

After eliminating all known non-primitive types, every surviving value is necessarily an object, making the isinstance check a tautology. While the intent (treat opaque custom types as primitives) is correct, stating it explicitly improves readability.

♻️ Suggested clarification
-        if isinstance(val, (list, tuple, ABCMapping)):
-            return False
-        return isinstance(val, object)
+        if isinstance(val, (list, tuple, ABCMapping)):
+            return False
+        return True  # any remaining opaque object is treated as a primitive
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/utils/utils.py` around lines 618 - 620, The final
isinstance(val, object) check is tautological; replace it with an explicit True
return and a short comment clarifying intent (e.g., "opaque custom types are
treated as primitives") so the branch reads clearly after the earlier
isinstance(val, (list, tuple, ABCMapping)) check — update the conditional block
around val to return True (with the comment) instead of using isinstance(val,
object).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/qs_codec/encode.py`:
- Line 171: Move the module-level sentinel _MISSING so it is defined before the
encode function that references it: locate the encode(...) function (which
assigns missing = _MISSING) and relocate the line defining _MISSING = object()
to the top of the module near the existing _sentinel (or otherwise above
encode). This ensures _MISSING is defined before use and avoids
forward-reference confusion for readers and static tools.
- Around line 536-541: The second None-guard after unconditionally assigning
frame.path is dead code; remove the redundant if block checking frame.path and
the RuntimeError, and instead document the invariant with an assertion or
comment (e.g., assert frame.path is not None) right after the assignment where
frame.path is set (reference symbols: frame.path, frame.prefix,
KeyPathNode.from_materialized, frame.add_query_prefix). Ensure tests/coverage
reflect the simpler flow.
- Around line 187-225: The __slots__ tuple on class _EncodeFrame is not
alphabetically ordered which triggers Ruff RUF023; open the _EncodeFrame class
and reorder the entries in the __slots__ tuple (reference:
_EncodeFrame.__slots__) into natural/alphabetical order (maintain all existing
names and commas, preserving the tuple structure) so the list is sorted and the
linter warning is resolved.

In `@src/qs_codec/models/key_path_node.py`:
- Line 9: Update the module/class docstring to remove the absolute claim of
"Immutable" and state that KeyPathNode represents immutable path semantics but
uses write-once lazy caches: mention that dot_encoded and materialized are
lazily populated cached slots (written once) rather than mutated for semantics;
keep the intent that path semantics do not change after construction but clarify
the caching behavior for dot_encoded and materialized in the docstring for the
KeyPathNode class.
- Line 11: The __slots__ tuple in class KeyPathNode is not sorted, triggering
Ruff RUF023; update the __slots__ declaration (the "__slots__" symbol in
key_path_node.py within class KeyPathNode) to be a deterministically sorted
tuple of attribute names (e.g., alphabetically) so that ("dot_encoded", "depth",
"materialized", "parent", "segment", "total_length") replaces the current
unsorted list; ensure the new order is used wherever __slots__ is referenced.

In `@src/qs_codec/utils/utils.py`:
- Around line 618-620: The final isinstance(val, object) check is tautological;
replace it with an explicit True return and a short comment clarifying intent
(e.g., "opaque custom types are treated as primitives") so the branch reads
clearly after the earlier isinstance(val, (list, tuple, ABCMapping)) check —
update the conditional block around val to return True (with the comment)
instead of using isinstance(val, object).

In `@tests/unit/encode_internal_helpers_test.py`:
- Around line 36-37: The test helper function custom_generator currently
disables pylint for the unused parameter but still triggers Ruff's ARG001 for
the unused prefix parameter; update the function signature for
custom_generator(prefix: str, key: t.Optional[str]) to add "# noqa: ARG001"
alongside the existing pylint disable so Ruff won't flag the unused argument,
keeping the existing behavior that only key is used.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c7db66 and 2da9be2.

📒 Files selected for processing (6)
  • src/qs_codec/encode.py
  • src/qs_codec/models/key_path_node.py
  • src/qs_codec/utils/utils.py
  • tests/unit/encode_internal_helpers_test.py
  • tests/unit/key_path_node_test.py
  • tests/unit/list_format_test.py

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on improving qs_codec.encode performance by reducing allocation and hot-path overhead during iterative traversal, while keeping public behavior/API stable and maintaining Python 3.8+ compatibility.

Changes:

  • Refactors encoder traversal to use KeyPathNode for lazy/cached key-path materialization (including dot-encoded views).
  • Reworks cycle detection bookkeeping to identity-based (id(obj)) lookups with legacy side-channel bootstrap retained for internal compatibility.
  • Adds targeted unit tests for list-format generators, KeyPathNode, and new internal encode helpers; updates merge utilities to use collections.abc.Mapping for runtime checks.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/qs_codec/encode.py Performance-oriented refactor of iterative encoding (path handling, cycle tracking, generator fast-paths).
src/qs_codec/models/key_path_node.py Introduces KeyPathNode to reduce string churn via lazy materialization and cached dot-encoding.
src/qs_codec/utils/utils.py Adjusts mapping runtime checks to collections.abc.Mapping and simplifies primitive checks.
tests/unit/list_format_test.py Adds coverage for list-format generator behavior.
tests/unit/key_path_node_test.py Adds coverage for KeyPathNode caching/materialization and dot-encoding behavior.
tests/unit/encode_internal_helpers_test.py Adds coverage for _identity_key and _next_path_for_sequence helper branches.

@techouse techouse requested a review from Copilot February 24, 2026 22:15
@techouse
Copy link
Owner Author

@codex review

@techouse techouse changed the title ⚡ improve encode performance ⚡ refactor encode engine to use iterative frame-based traversal with identity-based cycle detection Feb 24, 2026
@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/qs_codec/encode.py (1)

68-72: ⚠️ Potential issue | 🟠 Major

Deep-copy sequences when a callable filter is used to prevent input mutation.
With a callable filter, list/tuple values are projected without a deepcopy, so a mutating filter can change caller-owned elements. That violates the module’s no-mutation guarantee and the sequence-handling guideline.

🔧 Suggested fix
-    elif isinstance(value, (list, tuple)):
-        obj = {str(i): item for i, item in enumerate(value)}
+    elif isinstance(value, (list, tuple)):
+        seq = deepcopy(value) if callable(filter_opt) else value
+        obj = {str(i): item for i, item in enumerate(seq)}

As per coding guidelines, "Do not mutate caller inputs—copy/normalize before traversal (shallow copy for mappings; deep-copy only when a callable filter may mutate; index-projection for sequences)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 68 - 72, The sequence branch currently
creates obj = {str(i): item for i, item in enumerate(value)} without
deep-copying when filter_opt is callable, which allows a mutating filter to
modify caller-owned elements; update the list/tuple handling in encode.py so
that when isinstance(value, (list, tuple)) you deep-copy the sequence if
callable(filter_opt) before projecting to the index-keyed dict (use
deepcopy(value) vs value) so traversal/filtering cannot mutate the original
input while keeping the index-projection logic intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/qs_codec/encode.py`:
- Around line 68-72: The sequence branch currently creates obj = {str(i): item
for i, item in enumerate(value)} without deep-copying when filter_opt is
callable, which allows a mutating filter to modify caller-owned elements; update
the list/tuple handling in encode.py so that when isinstance(value, (list,
tuple)) you deep-copy the sequence if callable(filter_opt) before projecting to
the index-keyed dict (use deepcopy(value) vs value) so traversal/filtering
cannot mutate the original input while keeping the index-projection logic
intact.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2da9be2 and 1b7b7c9.

📒 Files selected for processing (5)
  • src/qs_codec/encode.py
  • src/qs_codec/models/key_path_node.py
  • src/qs_codec/utils/utils.py
  • tests/unit/encode_internal_helpers_test.py
  • tests/unit/encode_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/qs_codec/utils/utils.py
  • tests/unit/encode_internal_helpers_test.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/qs_codec/encode.py (2)

107-114: _MISSING sentinel used before definition.

_MISSING is referenced at line 107 (missing = _MISSING) but defined at line 172. This works at runtime because _MISSING is a module-level variable and encode() is only called after module load completes, but it's a readability concern—the definition sits after the function that uses it, separated by ~65 lines. Consider moving _MISSING = object() to just above _sentinel (before line 175) or to the top of the module near other module-level constants, which is already where it logically belongs alongside _sentinel and MAX_ENCODING_DEPTH_EXCEEDED.

That said, the current placement still works correctly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 107 - 114, The encode() function
references the module-level sentinel _MISSING before its definition; move the
declaration _MISSING = object() so it appears with the other module-level
constants (near _sentinel and MAX_ENCODING_DEPTH_EXCEEDED) before the encode()
function, ensuring _MISSING is defined prior to any references in encode() and
related helpers.

185-330: Per-frame duplication of invariant encoding options.

Every _EncodeFrame carries ~20 option fields (encoder, serialize_date, sort, filter_, formatter, format, generate_array_prefix, allow_empty_lists, strict_null_handling, skip_nulls, encode_dot_in_keys, allow_dots, encode_values_only, charset, add_query_prefix, comma_round_trip, comma_compact_nulls) that are identical across all frames in a single _encode call. For a depth-12 000 structure, that's 12 000 redundant copies of the same references.

Consider extracting the immutable options into a shared context object created once, and having each frame hold only per-node state (value, depth, path, obj, obj_keys, phase, cycle fields, etc.) plus a reference to the shared context. This would reduce per-frame allocation significantly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 185 - 330, The _EncodeFrame currently
stores ~20 immutable encoding options per instance causing huge duplication;
extract those into a single shared context object (e.g., new class
EncodeContext) created once at the start of _encode and referenced from each
_EncodeFrame instead of copying fields like encoder, serialize_date, sort,
filter_, formatter, format, generate_array_prefix, allow_empty_lists,
strict_null_handling, skip_nulls, encode_dot_in_keys, allow_dots,
encode_values_only, charset, add_query_prefix, comma_round_trip,
comma_compact_nulls; change _EncodeFrame.__init__ to accept a context param and
remove those option attributes from per-frame slots, update all places that
currently access frame.encoder / frame.format / etc. to use
frame.context.encoder (or similar), and ensure the context is immutable after
creation so frames only carry per-node state (value, depth, path, obj, obj_keys,
phase, cycle_*).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/qs_codec/encode.py`:
- Around line 107-114: The encode() function references the module-level
sentinel _MISSING before its definition; move the declaration _MISSING =
object() so it appears with the other module-level constants (near _sentinel and
MAX_ENCODING_DEPTH_EXCEEDED) before the encode() function, ensuring _MISSING is
defined prior to any references in encode() and related helpers.
- Around line 185-330: The _EncodeFrame currently stores ~20 immutable encoding
options per instance causing huge duplication; extract those into a single
shared context object (e.g., new class EncodeContext) created once at the start
of _encode and referenced from each _EncodeFrame instead of copying fields like
encoder, serialize_date, sort, filter_, formatter, format,
generate_array_prefix, allow_empty_lists, strict_null_handling, skip_nulls,
encode_dot_in_keys, allow_dots, encode_values_only, charset, add_query_prefix,
comma_round_trip, comma_compact_nulls; change _EncodeFrame.__init__ to accept a
context param and remove those option attributes from per-frame slots, update
all places that currently access frame.encoder / frame.format / etc. to use
frame.context.encoder (or similar), and ensure the context is immutable after
creation so frames only carry per-node state (value, depth, path, obj, obj_keys,
phase, cycle_*).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b7b7c9 and c1570de.

📒 Files selected for processing (2)
  • src/qs_codec/encode.py
  • tests/unit/encode_test.py

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.

@techouse techouse force-pushed the chore/optimizations-2 branch from c1570de to 2702277 Compare February 24, 2026 23:00
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/qs_codec/encode.py (1)

689-714: ⚠️ Potential issue | 🟡 Minor

Broad except Exception at line 712 silently swallows unexpected errors.

The catch-all at line 712 handles index/key errors from value extraction, but it also swallows any unexpected exception (e.g., a broken __getitem__ on a custom mapping raising RuntimeError). Consider narrowing to (KeyError, IndexError, TypeError) to avoid masking real bugs in user-provided containers.

Narrower exception handling
-                except Exception:  # noqa: BLE001  # pylint: disable=W0718
+                except (KeyError, IndexError, TypeError):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 689 - 714, The current broad "except
Exception" in the value-extraction block swallows unexpected errors when
computing _value from frame.obj for keys like _key; change the exception guard
to only catch expected lookup errors (e.g., except (KeyError, IndexError,
TypeError):) so that KeyError/IndexError/TypeError set _value = None and
_value_undefined = True while letting other exceptions (e.g., RuntimeError from
a broken __getitem__) propagate; update the try/except around the
frame.is_mapping / frame.is_sequence / frame.obj[_key] logic that assigns _value
and _value_undefined accordingly.
🧹 Nitpick comments (7)
src/qs_codec/encode.py (6)

338-346: _identity_key returns raw int for int inputs — verify no collision with id() values.

When value is an int (i.e., an id() result), the function returns it directly. When value is anything else, it returns id(value). Since all call-sites pass id(obj) (already an int), the int-passthrough path is always taken. This is consistent, but the function signature accepts t.Any which could lead to confusion if someone passes a non-id int in the future. A brief docstring noting "expects id()-derived ints" would help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 338 - 346, Add a concise docstring to
_identity_key explaining that it expects ints that are raw id() values (and
therefore returns ints unchanged), describes the WeakWrapper branch behavior
(returns id(value.value) or id(value) on ReferenceError), and warns callers not
to pass arbitrary non-id ints; optionally add a short runtime assertion (or
comment) in _identity_key that callers should pass id(obj) to avoid future
confusion.

768-772: Implicit "await_child" phase — add an explicit label check or comment.

The final else branch (lines 768-772) handles returning from a child frame. Unlike other phases, it lacks a frame.phase == "await_child" guard. While structurally sound (it's the only remaining case), an explicit check would make the state machine self-documenting and guard against future phase additions.

Explicit phase guard
-        if isinstance(last_result, (list, tuple)):
+        if frame.phase != "await_child":  # pragma: no cover - internal invariant
+            raise RuntimeError(f"Unexpected phase: {frame.phase}")  # noqa: TRY003
+
+        if isinstance(last_result, (list, tuple)):
             frame.values.extend(last_result)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 768 - 772, The code handling
return-from-child currently treats any non-iterable last_result as the
"await_child" case without an explicit guard; update the block that appends
last_result to first check that frame.phase == "await_child" (or add an explicit
comment plus an assertion) before extending/appending to frame.values and
setting frame.phase = "iterate" so the state machine is self-documenting and
robust against new phases; use the existing symbols frame.phase, last_result,
frame.values and set frame.phase = "iterate" only when the explicit check
passes.

172-172: _MISSING is defined after encode but used inside it — works at runtime but is fragile to reordering.

_MISSING at line 172 is referenced at line 107 inside encode(). This works because the function body executes at call-time, not definition-time — but it's a subtle ordering dependency. Consider moving _MISSING = object() above the encode function (e.g. next to other module-level constants) to make the dependency obvious and avoid breakage if someone reorders definitions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` at line 172, The _MISSING sentinel is defined after
the encode function but referenced inside encode (symbol: encode uses _MISSING),
creating a fragile ordering dependency; move the module-level definition
_MISSING = object() so it appears before the encode function (place it with the
other module constants) to ensure the sentinel is always available when encode
is defined/used and avoid breakage if definitions are reordered.

586-591: Redundant isinstance(obj, bool) check in the non-encoder ternary.

The ternary at lines 588-590 double-checks isinstance(obj, bool). This is functionally correct but slightly hard to read. At this code point, Utils.is_non_nullish_primitive(obj, ...) already returned True, so obj is a known primitive — but it could be any primitive type, hence the bool guard is needed. The logic is correct; just noting the readability could be improved.

Clearer alternative
-                    value_str = (
-                        "true" if isinstance(obj, bool) and obj else "false" if isinstance(obj, bool) else str(obj)
-                    )
+                    if isinstance(obj, bool):
+                        value_str = "true" if obj else "false"
+                    else:
+                        value_str = str(obj)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 586 - 591, The code repeats
isinstance(obj, bool) inside the nested ternary when building value_str;
simplify by replacing the nested ternary with a single boolean check: compute
value_str as "true" if isinstance(obj, bool) and obj else "false" if
isinstance(obj, bool) else str(obj) should be refactored to a single conditional
that checks isinstance(obj, bool) once (e.g., if isinstance(obj, bool):
value_str = "true" if obj else "false" else: value_str = str(obj)) and then use
frame.formatter(key_text) and frame.formatter(value_str) when populating
result_tokens (refer to frame.formatter, key_text, value_str, result_tokens).

374-393: Cycle-detection tuple unpacking could benefit from a named structure.

nearest is a Tuple[int, int, bool] (level, pos, is_top), accessed via nearest[0], nearest[1], nearest[2]. Using a NamedTuple or at least local unpacking (level, pos, is_top = nearest) would improve readability of the distance/step comparison logic.

Readability improvement
     nearest = entries[-1]
-    distance = current_level - nearest[0]
-    if nearest[1] == distance:
+    ancestor_level, ancestor_pos, is_top = nearest
+    distance = current_level - ancestor_level
+    if ancestor_pos == distance:
         raise ValueError("Circular reference detected")  # noqa: TRY003
 
-    return 0 if nearest[2] else distance
+    return 0 if is_top else distance
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 374 - 393, The tuple `nearest` in
_compute_step_and_check_cycle (from _CycleState.entries) is accessed by index;
unpack it into meaningful names (e.g., ancestor_level, ancestor_pos, is_top) or
replace the stored tuple type with a NamedTuple/Dataclass (e.g., CycleEntry) and
use its fields, then compute distance = current_level - ancestor_level, compare
ancestor_pos to distance, raise the same ValueError if equal, and return 0 if
is_top else distance; keep references to _identity_key and state.entries
unchanged so the behavior is identical.

185-330: Frame class is correct but very large — consider a builder or child-frame factory.

The _EncodeFrame has 37 slots and a 28-parameter __init__. Most parameters are "inherited" config that propagates unchanged from parent to child. A spawn_child(...) method on the frame that copies config and only accepts the few fields that change (value, path, depth, etc.) would reduce the 35-line child-frame construction at lines 731-764 and lower the risk of accidentally diverging a parameter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 185 - 330, The _EncodeFrame constructor
is being repeatedly called with many unchanged config parameters when creating
child frames; add a spawn_child method on class _EncodeFrame that clones all
invariant configuration attributes (encoder, serialize_date, sort, filter_,
formatter, format, generate_array_prefix, allow_empty_lists,
strict_null_handling, skip_nulls, encode_dot_in_keys, allow_dots,
encode_values_only, charset, add_query_prefix, max_depth, comma_round_trip,
comma_compact_nulls, side_channel, etc.) and returns a new _EncodeFrame
accepting only the changing fields (value, path, depth, is_undefined, prefix,
obj/obj_id/obj_keys/values/index/phase as needed, cycle_state, cycle_level,
adjusted_path); then replace the long direct calls to _EncodeFrame(...) that
build child frames with frame.spawn_child(...) so callers only pass the small
set of differing arguments and reduce duplication/risk of diverging parameters.
src/qs_codec/utils/utils.py (1)

618-623: isinstance(val, object) always returns True — consider simplifying.

Line 623's return isinstance(val, object) is tautological in Python: every value is an instance of object. The comment explains it's kept for monkeypatch compatibility, which is a valid (if niche) reason. However, a plain return True would be clearer and equivalent in all non-monkeypatched scenarios.

Simplification
-        # Opaque custom types are treated as primitives; keep the explicit fallback
-        # check for compatibility with tests that monkeypatch `isinstance`.
-        return isinstance(val, object)
+        # Opaque custom types are treated as primitives.
+        return True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/utils/utils.py` around lines 618 - 623, The final return uses
the tautological expression `isinstance(val, object)` which is always True;
replace that expression with a plain `True` to make intent clear while
preserving the surrounding comment about monkeypatch compatibility and the
earlier checks for lists/tuples/ABCMapping; update the return statement that
currently reads `return isinstance(val, object)` to `return True` so the
function that inspects `val` behaves identically but is clearer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/qs_codec/encode.py`:
- Around line 689-714: The current broad "except Exception" in the
value-extraction block swallows unexpected errors when computing _value from
frame.obj for keys like _key; change the exception guard to only catch expected
lookup errors (e.g., except (KeyError, IndexError, TypeError):) so that
KeyError/IndexError/TypeError set _value = None and _value_undefined = True
while letting other exceptions (e.g., RuntimeError from a broken __getitem__)
propagate; update the try/except around the frame.is_mapping / frame.is_sequence
/ frame.obj[_key] logic that assigns _value and _value_undefined accordingly.

---

Nitpick comments:
In `@src/qs_codec/encode.py`:
- Around line 338-346: Add a concise docstring to _identity_key explaining that
it expects ints that are raw id() values (and therefore returns ints unchanged),
describes the WeakWrapper branch behavior (returns id(value.value) or id(value)
on ReferenceError), and warns callers not to pass arbitrary non-id ints;
optionally add a short runtime assertion (or comment) in _identity_key that
callers should pass id(obj) to avoid future confusion.
- Around line 768-772: The code handling return-from-child currently treats any
non-iterable last_result as the "await_child" case without an explicit guard;
update the block that appends last_result to first check that frame.phase ==
"await_child" (or add an explicit comment plus an assertion) before
extending/appending to frame.values and setting frame.phase = "iterate" so the
state machine is self-documenting and robust against new phases; use the
existing symbols frame.phase, last_result, frame.values and set frame.phase =
"iterate" only when the explicit check passes.
- Line 172: The _MISSING sentinel is defined after the encode function but
referenced inside encode (symbol: encode uses _MISSING), creating a fragile
ordering dependency; move the module-level definition _MISSING = object() so it
appears before the encode function (place it with the other module constants) to
ensure the sentinel is always available when encode is defined/used and avoid
breakage if definitions are reordered.
- Around line 586-591: The code repeats isinstance(obj, bool) inside the nested
ternary when building value_str; simplify by replacing the nested ternary with a
single boolean check: compute value_str as "true" if isinstance(obj, bool) and
obj else "false" if isinstance(obj, bool) else str(obj) should be refactored to
a single conditional that checks isinstance(obj, bool) once (e.g., if
isinstance(obj, bool): value_str = "true" if obj else "false" else: value_str =
str(obj)) and then use frame.formatter(key_text) and frame.formatter(value_str)
when populating result_tokens (refer to frame.formatter, key_text, value_str,
result_tokens).
- Around line 374-393: The tuple `nearest` in _compute_step_and_check_cycle
(from _CycleState.entries) is accessed by index; unpack it into meaningful names
(e.g., ancestor_level, ancestor_pos, is_top) or replace the stored tuple type
with a NamedTuple/Dataclass (e.g., CycleEntry) and use its fields, then compute
distance = current_level - ancestor_level, compare ancestor_pos to distance,
raise the same ValueError if equal, and return 0 if is_top else distance; keep
references to _identity_key and state.entries unchanged so the behavior is
identical.
- Around line 185-330: The _EncodeFrame constructor is being repeatedly called
with many unchanged config parameters when creating child frames; add a
spawn_child method on class _EncodeFrame that clones all invariant configuration
attributes (encoder, serialize_date, sort, filter_, formatter, format,
generate_array_prefix, allow_empty_lists, strict_null_handling, skip_nulls,
encode_dot_in_keys, allow_dots, encode_values_only, charset, add_query_prefix,
max_depth, comma_round_trip, comma_compact_nulls, side_channel, etc.) and
returns a new _EncodeFrame accepting only the changing fields (value, path,
depth, is_undefined, prefix, obj/obj_id/obj_keys/values/index/phase as needed,
cycle_state, cycle_level, adjusted_path); then replace the long direct calls to
_EncodeFrame(...) that build child frames with frame.spawn_child(...) so callers
only pass the small set of differing arguments and reduce duplication/risk of
diverging parameters.

In `@src/qs_codec/utils/utils.py`:
- Around line 618-623: The final return uses the tautological expression
`isinstance(val, object)` which is always True; replace that expression with a
plain `True` to make intent clear while preserving the surrounding comment about
monkeypatch compatibility and the earlier checks for lists/tuples/ABCMapping;
update the return statement that currently reads `return isinstance(val,
object)` to `return True` so the function that inspects `val` behaves
identically but is clearer.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c1570de and 2702277.

📒 Files selected for processing (7)
  • src/qs_codec/encode.py
  • src/qs_codec/models/key_path_node.py
  • src/qs_codec/utils/utils.py
  • tests/unit/encode_internal_helpers_test.py
  • tests/unit/encode_test.py
  • tests/unit/key_path_node_test.py
  • tests/unit/list_format_test.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/unit/encode_internal_helpers_test.py
  • tests/unit/list_format_test.py
  • tests/unit/key_path_node_test.py
  • tests/unit/encode_test.py
  • src/qs_codec/models/key_path_node.py

@codecov
Copy link

codecov bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (3c7db66) to head (9dcc65d).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##              main       #43    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           17        18     +1     
  Lines         1436      1588   +152     
==========================================
+ Hits          1436      1588   +152     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/qs_codec/encode.py (2)

185-330: Consider extracting immutable options into a shared context object.

Nearly all option fields (formatter, format, sort, skip_nulls, allow_dots, encode_dot_in_keys, charset, strict_null_handling, serialize_date, allow_empty_lists, etc.) are copied verbatim from parent to child frame. Pulling them into a single _EncodeContext dataclass (or __slots__ class) that every frame references would cut per-frame allocation by ~18 fields and simplify the _EncodeFrame.__init__ signature to about 8 per-frame fields (value, is_undefined, prefix, path, depth, cycle_state, cycle_level, encoder).

This is an optional follow-up — the __slots__ usage already keeps things lean, and benchmarks show a clear win.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 185 - 330, Extract the many immutable
per-frame options into a shared _EncodeContext (preferably a dataclass or small
__slots__ class) and have _EncodeFrame hold a single reference to that context
instead of copying fields; update _EncodeFrame.__init__ to accept a context
parameter (containing formatter, format, sort, skip_nulls, allow_dots,
encode_dot_in_keys, charset, strict_null_handling, serialize_date,
allow_empty_lists, generate_array_prefix, comma_compact_nulls, comma_round_trip,
encoder, etc.) and remove those individual assignments from the frame, leaving
only truly per-frame state (value, is_undefined, prefix, path, depth, max_depth,
cycle_state, cycle_level, index/step/obj-related fields); ensure code paths that
previously constructed child frames copy or reuse the same _EncodeContext
instance when creating new _EncodeFrame objects so behavior is identical but
allocations are reduced.

540-553: Phase strings as magic constants on a hot path.

The three-way phase dispatch ("start", "iterate", "await_child") runs every iteration of the main while stack loop. String equality is cheap in CPython (interned short literals use pointer comparison), but using module-level int constants would make the intent explicit and eliminate any ambiguity across alternative runtimes.

_PHASE_START = 0
_PHASE_ITERATE = 1
_PHASE_AWAIT_CHILD = 2

This is a minor nit given CPython interns these already.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 540 - 553, Introduce module-level
integer constants (e.g., _PHASE_START = 0, _PHASE_ITERATE = 1,
_PHASE_AWAIT_CHILD = 2) and replace all string-literal phase checks and
assignments around the main while stack loop with these constants (e.g., change
if frame.phase == "start" to if frame.phase == _PHASE_START). Update any places
that set frame.phase (or compare it elsewhere) to use the new constants, and
keep existing behavior for related symbols such as
frame.generate_array_prefix/_COMMA_GENERATOR, frame.depth/_get_max_encode_depth,
and KeyPathNode.from_materialized; ensure no other code relies on the string
values (update callers/tests if needed) so the phase dispatch uses the integer
constants consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/qs_codec/encode.py`:
- Around line 185-330: Extract the many immutable per-frame options into a
shared _EncodeContext (preferably a dataclass or small __slots__ class) and have
_EncodeFrame hold a single reference to that context instead of copying fields;
update _EncodeFrame.__init__ to accept a context parameter (containing
formatter, format, sort, skip_nulls, allow_dots, encode_dot_in_keys, charset,
strict_null_handling, serialize_date, allow_empty_lists, generate_array_prefix,
comma_compact_nulls, comma_round_trip, encoder, etc.) and remove those
individual assignments from the frame, leaving only truly per-frame state
(value, is_undefined, prefix, path, depth, max_depth, cycle_state, cycle_level,
index/step/obj-related fields); ensure code paths that previously constructed
child frames copy or reuse the same _EncodeContext instance when creating new
_EncodeFrame objects so behavior is identical but allocations are reduced.
- Around line 540-553: Introduce module-level integer constants (e.g.,
_PHASE_START = 0, _PHASE_ITERATE = 1, _PHASE_AWAIT_CHILD = 2) and replace all
string-literal phase checks and assignments around the main while stack loop
with these constants (e.g., change if frame.phase == "start" to if frame.phase
== _PHASE_START). Update any places that set frame.phase (or compare it
elsewhere) to use the new constants, and keep existing behavior for related
symbols such as frame.generate_array_prefix/_COMMA_GENERATOR,
frame.depth/_get_max_encode_depth, and KeyPathNode.from_materialized; ensure no
other code relies on the string values (update callers/tests if needed) so the
phase dispatch uses the integer constants consistently.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2702277 and 1c73320.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • src/qs_codec/encode.py
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

@techouse
Copy link
Owner Author

@coderabbitai resume

@techouse techouse requested a review from Copilot February 25, 2026 17:37
@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

✅ Actions performed

Reviews resumed.

@codacy-production
Copy link

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.00% (target: -1.00%) 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (3c7db66) 1436 1436 100.00%
Head commit (56dcd84) 1585 (+149) 1585 (+149) 100.00% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#43) 234 234 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

@codacy-production
Copy link

codacy-production bot commented Feb 25, 2026

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.00% (target: -1.00%) 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (3c7db66) 1436 1436 100.00%
Head commit (9dcc65d) 1588 (+152) 1588 (+152) 100.00% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#43) 240 240 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
src/qs_codec/models/key_path_node.py (1)

70-96: materialize depth-2 fast path doesn't populate parent.materialized.

At lines 82–84, the depth-2 fast path reads parent.materialized or parent.segment but never writes parent.materialized if it's unset. The parent will get its cache filled only if materialize() is called on it directly. While this is functionally correct (repeated calls to the parent will still produce the right result), it means the parent cache misses one possible fill opportunity, so a sibling node at depth 2 will re-compute the same parent_part string again.

This is an internal performance nit. Consider eagerly populating parent.materialized in the depth-2 fast path:

♻️ Proposed improvement
         if self.depth == 2:
-            parent_part = parent.materialized if parent.materialized is not None else parent.segment
+            parent_part = parent.materialized
+            if parent_part is None:
+                parent_part = parent.segment
+                parent.materialized = parent_part
             value = parent_part + self.segment
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/models/key_path_node.py` around lines 70 - 96, The depth-2 fast
path in materialize() reads parent.materialized or parent.segment but doesn't
populate parent.materialized when it's None; update the depth==2 branch in
materialize (on KeyPathNode) to eagerly set parent.materialized = parent.segment
when parent.materialized is None before computing value, so the parent cache is
filled for subsequent sibling calls and avoids redundant work.
src/qs_codec/encode.py (3)

172-179: _MISSING is defined after encode() which references it.

Python resolves global names at call time so this is correct at runtime, but defining _MISSING below the function that uses it (line 107) is unconventional. Grouping module-level constants before the functions that reference them improves readability.

♻️ Suggested placement
+_MISSING = object()
+_PHASE_START = 0
+_PHASE_ITERATE = 1
+_PHASE_AWAIT_CHILD = 2
+
 # Unique placeholder used as a key within the side-channel chain to pass context down recursion.
 _sentinel: WeakWrapper = WeakWrapper({})
 MAX_ENCODING_DEPTH_EXCEEDED = "Maximum encoding depth exceeded"
-_PHASE_START = 0
-_PHASE_ITERATE = 1
-_PHASE_AWAIT_CHILD = 2

And remove the _MISSING = object() from line 172.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 172 - 179, The module-level sentinel
_MISSING is declared after the encode() function that references it; move the
single definition of _MISSING so it appears with the other top-level constants
(near _sentinel, MAX_ENCODING_DEPTH_EXCEEDED,
_PHASE_START/_PHASE_ITERATE/_PHASE_AWAIT_CHILD) before the encode() definition,
and remove the trailing duplicate `_MISSING = object()` later in the file so
encode() and other callers reference the same upfront constant.

430-446: Custom generator fallback in _next_path_for_sequence silently drops path hierarchy.

When a user-supplied generate_array_prefix returns a child path that does not start with the materialized parent (lines 444–446), the code falls back to KeyPathNode.from_materialized(child) — a flat root node with no parent link. This discards the accumulated path chain, meaning subsequent as_dot_encoded() / materialize() calls on descendants operate only on the suffix, not on the full hierarchical path. The behavior is still functionally correct for the generated key string, but the hierarchy loss may cause incorrect dot-encoding of ancestor segments when encode_dot_in_keys=True.

Consider adding an inline comment acknowledging this limitation:

♻️ Suggested documentation
     parent = path.materialize()
     child = generator(parent, encoded_key)
     if child.startswith(parent):
         return path.append(child[len(parent) :])
-    return KeyPathNode.from_materialized(child)
+    # Custom generator produced a path unrelated to `parent`; fall back to a
+    # materialised root node.  Dot-encoding of ancestor segments is not
+    # applied in this branch.
+    return KeyPathNode.from_materialized(child)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 430 - 446, In _next_path_for_sequence,
when a custom generator (passed as generator / generate_array_prefix) returns a
child string that does not start with the materialized parent, the code falls
back to KeyPathNode.from_materialized(child) which intentionally produces a
root-level node and discards the existing parent chain; add a clear inline
comment next to that fallback (in the _next_path_for_sequence function, around
the return KeyPathNode.from_materialized(child) lines) explaining this
limitation, that the hierarchy is lost so descendant
materialize()/as_dot_encoded() will not include ancestor segments (notably when
encode_dot_in_keys=True), and note this is deliberate behavior for custom
generators so callers should return paths prefixed by the parent if they wish to
preserve ancestry.

540-788: Phase dispatch uses sequential if blocks instead of if/elif/elif.

Lines 543, 678, and 781 each check the frame phase with independent if statements. Since every _PHASE_START and _PHASE_ITERATE path ends with continue, there is no fall-through — the code is correct. However, using elif for _PHASE_ITERATE and the final guard would communicate mutual exclusivity and avoid evaluating subsequent comparisons on the hot path.

♻️ Proposed refactor
-        if frame.phase == _PHASE_START:
+        if frame.phase == _PHASE_START:  # noqa: SIM102  (intentional fall-through-free structure)
             ...
             continue

-        if frame.phase == _PHASE_ITERATE:
+        elif frame.phase == _PHASE_ITERATE:
             ...
             continue

-        if frame.phase != _PHASE_AWAIT_CHILD:  # pragma: no cover - internal invariant
+        elif frame.phase != _PHASE_AWAIT_CHILD:  # pragma: no cover - internal invariant
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/qs_codec/encode.py` around lines 540 - 788, The dispatch on frame.phase
uses three separate if statements (checking frame.phase == _PHASE_START, ==
_PHASE_ITERATE, and != _PHASE_AWAIT_CHILD) which is correct functionally but
wastes checks on the hot path; change the second and third independent ifs into
elif and else branches respectively so the code reads: if frame.phase ==
_PHASE_START: ... elif frame.phase == _PHASE_ITERATE: ... else: (guard for
unexpected phase / treat as _PHASE_AWAIT_CHILD). Update the branch conditions
around the symbols frame.phase, _PHASE_START, _PHASE_ITERATE, and
_PHASE_AWAIT_CHILD to reflect mutual exclusivity while preserving all existing
continue/pop/stack behavior and runtime checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/qs_codec/encode.py`:
- Around line 172-179: The module-level sentinel _MISSING is declared after the
encode() function that references it; move the single definition of _MISSING so
it appears with the other top-level constants (near _sentinel,
MAX_ENCODING_DEPTH_EXCEEDED, _PHASE_START/_PHASE_ITERATE/_PHASE_AWAIT_CHILD)
before the encode() definition, and remove the trailing duplicate `_MISSING =
object()` later in the file so encode() and other callers reference the same
upfront constant.
- Around line 430-446: In _next_path_for_sequence, when a custom generator
(passed as generator / generate_array_prefix) returns a child string that does
not start with the materialized parent, the code falls back to
KeyPathNode.from_materialized(child) which intentionally produces a root-level
node and discards the existing parent chain; add a clear inline comment next to
that fallback (in the _next_path_for_sequence function, around the return
KeyPathNode.from_materialized(child) lines) explaining this limitation, that the
hierarchy is lost so descendant materialize()/as_dot_encoded() will not include
ancestor segments (notably when encode_dot_in_keys=True), and note this is
deliberate behavior for custom generators so callers should return paths
prefixed by the parent if they wish to preserve ancestry.
- Around line 540-788: The dispatch on frame.phase uses three separate if
statements (checking frame.phase == _PHASE_START, == _PHASE_ITERATE, and !=
_PHASE_AWAIT_CHILD) which is correct functionally but wastes checks on the hot
path; change the second and third independent ifs into elif and else branches
respectively so the code reads: if frame.phase == _PHASE_START: ... elif
frame.phase == _PHASE_ITERATE: ... else: (guard for unexpected phase / treat as
_PHASE_AWAIT_CHILD). Update the branch conditions around the symbols
frame.phase, _PHASE_START, _PHASE_ITERATE, and _PHASE_AWAIT_CHILD to reflect
mutual exclusivity while preserving all existing continue/pop/stack behavior and
runtime checks.

In `@src/qs_codec/models/key_path_node.py`:
- Around line 70-96: The depth-2 fast path in materialize() reads
parent.materialized or parent.segment but doesn't populate parent.materialized
when it's None; update the depth==2 branch in materialize (on KeyPathNode) to
eagerly set parent.materialized = parent.segment when parent.materialized is
None before computing value, so the parent cache is filled for subsequent
sibling calls and avoids redundant work.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c73320 and 56dcd84.

📒 Files selected for processing (3)
  • src/qs_codec/encode.py
  • src/qs_codec/models/key_path_node.py
  • tests/unit/key_path_node_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/key_path_node_test.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/qs_codec/encode.py`:
- Around line 675-678: The empty-list branch emits a raw materialized path token
which bypasses key formatting/encoding; change the assignment that now uses
frame.adjusted_path.append("[]").materialize() so it instead produces the token
via the same key-formatting/key-encoding pipeline used by other key-only
branches (i.e., call the shared formatter/encoder used elsewhere for path
tokens), then assign that encoded/formatted token into last_result after
stack.pop(); make sure to use the same helpers (the formatter/encoder functions
used by other branches) so reserved characters and custom key encoders are
applied.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56dcd84 and c8e390c.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • src/qs_codec/encode.py
  • src/qs_codec/models/key_path_node.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/qs_codec/models/key_path_node.py

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Improve encode hot paths, cycle-state helpers, phase dispatch, and key-path caching behavior.

Includes iterative dot-encoding in KeyPathNode to avoid deep recursion and related internal clarity updates.
Expand tests for key-path nodes, generator fallbacks, strict-null formatting, and callable-filter mutation safety.
@techouse techouse force-pushed the chore/optimizations-2 branch from c8e390c to 9dcc65d Compare February 25, 2026 18:42
@techouse techouse merged commit 8ea0bff into main Feb 25, 2026
23 checks passed
@techouse techouse deleted the chore/optimizations-2 branch February 25, 2026 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants